keywords:"datasets" - Search Results - Digital Repository


	Movie Recommender System Janko, Pavel ; Zbořil, František (referee) ; Šůstek, Martin (advisor) This thesis addresses methods for building a movie recommender system, covering both the basic and the advanced techniques required. The core of the thesis is designing, implementing and experimenting with a movie recommender system based on data from publicly accessible datasets. To predict the ratings a user would give to movies after watching them, the system uses a factorization model based on collaborative filtering. The thesis also describes the relation between hyperparameter configuration and prediction accuracy, reports experiments conducted to further improve model accuracy, and compares the implemented model with existing solutions.
	Improving Consistency in Text Recognition Datasets Tvarožný, Matúš ; Hradiš, Michal (referee) ; Kišš, Martin (advisor) This work is concerned with increasing the consistency of datasets for text recognition. It describes the problems that cause inconsistency and presents solutions to eliminate them. It investigates how the properties of the polygons defining text line boundaries affect model accuracy, using a modified version of the dataset composed of ideal text line variants. The work further focuses on detecting, and then removing or modifying, text lines whose ground truth transcription does not match the text they actually contain. Experiments showed that removing the visual inconsistency from the training set had no significant effect on the trained model, whereas modifying the test set improved the OCR accuracy of the model by 1.1% CER. Modifying the dataset so that it contained no mutually inconsistent pairs of recognized text and corresponding ground truth improved the model by at most 0.2% CER after re-training. The main finding is the demonstrated benefit of removing inconsistencies from test sets, which makes it possible to determine a more realistic error rate for the OCR model.
	Network Forensics Tools Survey and Taxonomy Zembjaková, Martina ; Ryšavý, Ondřej (referee) ; Pluskal, Jan (advisor) This master's thesis presents a survey and taxonomy of network forensic tools. It describes the basics of network forensic analysis, including the process models, techniques and data sources used. It then surveys and compares existing taxonomies of network forensic tools, followed by a survey of the tools themselves. The tools discussed include those mentioned in the taxonomy survey as well as several additional network tools. The thesis goes on to describe and compare in detail the datasets that serve as input for analysis by the individual tools. Based on the information gathered in the surveys, common use cases are proposed, and the tools are demonstrated as part of the description of each use case. The demonstrations use both publicly available datasets and newly created datasets, which are described in detail in a separate chapter. Finally, a new taxonomy is proposed that is based on tool use cases, in contrast to other taxonomies based on NFAT and NSM tools, user interface, data capture, analysis, or type of forensic analysis.
	Detection of Fake News Using Machine Learning Koreň, Matej ; Zbořil, František (referee) ; Hříbek, David (advisor) This thesis focuses on the use of machine learning in fake news detection. Four models were selected for this purpose: Bayesian, Decision Tree, Support Vector Machine and a Neural Network. In five experiments on various datasets, these models were trained, tested, evaluated and compared with state-of-the-art methods. The final implementation takes the form of a Python package that allows its users to replicate this procedure with their own data. Beyond the scope of the assignment, a Slovak dataset, Dezinfo SK, was created.
	Acceleration of Face Recognition Algorithm with Neural Compute Stick 2 Mičánková, Eva ; Beran, Jan (referee) ; Goldmann, Tomáš (advisor) This thesis focuses on face recognition in images using neural networks, and on its acceleration. It provides an overview of previously used techniques and addresses the use of the currently dominant convolutional neural networks for this task. The work also covers acceleration mechanisms that can be used in this area. Building on this background, a system based on the concept of edge computing was created. It can serve as a home security system connected to an IP camera and sends a notification when an unknown person is present in a guarded area.
	Data Sets for Network Security Setinský, Jiří ; Hranický, Radek (referee) ; Tisovčík, Peter (advisor) In network security, machine learning techniques are used to detect anomalies and malware in network traffic effectively. Training a network classifier with high accuracy requires a quality dataset. The aim of this work is to modify a dataset using machine learning techniques in order to improve its quality, which in turn leads to a more accurate trained model. The dataset is analyzed by a clustering algorithm, and each cluster is characterized by a statistical description derived from the attributes of the input dataset. The statistical description, together with information from the original classifier, is used to compute a score, which serves as a weight in the modification phase. Cluster analysis makes it possible to filter out the data that are important for training the final model. The proposed approach makes it possible to mitigate redundancy in the dataset or to augment it with missing data. The result is a modification framework that can reduce datasets or aggregate them to create a compact dataset that reflects actual network traffic. Models trained on the created datasets achieved higher accuracy than the existing solution.
	Model of Cycling Traffic Intensity in Brno Eliáš, Radoslav ; Burget, Radek (referee) ; Hynek, Jiří (advisor) The data department in Brno has access to several datasets on cyclist counts. The aim of this thesis was to create a model integrating these sources for the transport department of the city municipality, giving it an overview of how the infrastructure is used day to day. Each dataset is aggregated onto a different base map with a slightly different street network. The thesis presents an algorithmic approach to matching streets based on similarity, percentage overlap and other parameters. Two algorithms are provided for matching geometry, one based on points and one based on line segments. The thesis also provides a model that maps locations between the different datasets and a dashboard that visualizes their values side by side. The robustness of the algorithms allows them to be used in any geographic application that works with spatial data. The dashboard provides useful information about cycling traffic both to ordinary users and to the experts who design the infrastructure of the city of Brno.
	Relation Extraction from Text Královič, Kristián ; Ondřej, Karel (referee) ; Smrž, Pavel (advisor) This bachelor thesis focuses on the extraction of semantic relations between named entities in natural text using learning with a small number of supporting examples. The theoretical part introduces methods for natural language representation using dense vectors and for named entity recognition. Next, deep learning based approaches to semantic relation extraction are described. The theoretical part also describes learning with a small number of training examples in the context of semantic relation extraction. In the implementation part, a system for extracting semantic relations from text is proposed. The system uses pairwise classifiers based on pre-trained language models like transformers to classify the relations. For the purpose of this work, the ELECTRA-PAIR, RoBERTa-PAIR and BERT-PAIR models were trained. In the experimental part, these models are evaluated on different datasets. The experimental part also includes experiments aimed at classifying more complex semantic relations.
